ABSTRACT

This paper describes a new method of sensor failure 
detection, isolation, and accommodation using a neural 
network approach. In a propulsion system such as the 
Space Shuttle Main Engine, the dynamics are usually 
very complicated and sometimes not well known. 
However, the number of variables measured is usually 
much higher than the order of the system. This built-in 
redundancy of the sensors can be utilized to detect and 
correct sensor failure problems. The goal of the 
proposed scheme is to train a neural network to identify 
the sensor whose measurement is not consistent with 
other sensor outputs. Another neural network is trained 
to recover the value of critical variables when their 
measurements fail. Techniques for training the network 
with a limited amount of data are developed. The 
proposed scheme is tested using the simulated data of 
the Space Shuttle Main Engine (SSME) inflight sensor 
group. 

INTRODUCTION 

In 1980, a ground test of the Space Shuttle Main 
Engine (SSME) experienced an erroneous measurement 
of the Main Combustion Chamber pressure (Pc) [1]. Pc 
is used for the closed loop thrust level control as well 
as closed loop mixture ratio calculations. The failed 
sensor reading led the testing to a severely abnormal 
operating condition. An internal fire and subsequent 
explosion occurred as a result of the sensor failure. 
The engine was virtually destroyed. Also, during the 
course of the Space Shuttle program there have been 
numerous incidents of sensor failures which caused 
component damage, unnecessary shutdowns and delays 
of the program. 

In order to improve the operational reliability it is 
necessary to validate the measured sensor data, isolate 
any failed sensor and recover the failed critical 
measurement. There has been an extended effort in 
applying analytical redundancy to the sensor failure 
detection and isolation in the jet engine failure
diagnosis problem [2]. In general, this approach
utilizes the engine model and the Kalman Filter to 
detect and isolate sensor failures. This technique is 
strongly dependent upon a reliable system model which 
may not always be attainable in a complex system. 

This paper proposes that neural networks be trained 
by experimental data and learn the relationships among 
the redundant sensors. These networks are then used to 
check the validity of the sensor readings and provide an 
estimated value for failed sensors. This paper will first 
describe some of the system dynamics of the Space 
Shuttle Main Engine. The selection and the training 
algorithms of the neural networks are then presented, 
followed by the simulation results of the proposed 
approach. Finally, a discussion of the research is 
presented. 

The SSME DYNAMICS 

The Space Shuttle Main Engine under study is by
far the most complicated and power-intensive machine
among propulsion engines. A simplified description of
the system operation follows [3,4]. There are three
main engines in a space shuttle orbiter. Each engine
produces a sea level thrust of 375,000 lb and a vacuum
thrust of 470,000 lb. A schematic diagram of the
propellant flows is shown in Fig. 1. Pressurized fuel,
provided by the fuel tank, flows through the low
pressure fuel pump and the high pressure fuel pump and
is fed to the regenerative cooling and the preburners. A
pressurized oxidizer tank provides the oxidizer, which
flows through the low pressure oxidizer pump and the
high pressure oxidizer pump, where the output flow
splits into the two preburners and the main combustion
chamber as shown in Fig. 1.

The dynamics of the system operation include: (1)
the performance of turbopumps; (2) the heat exchange
of the cooling flows; (3) the combustion of the two
preburners and the main chamber; (4) the control valve
actions; and (5) the energy properties of oxygen and
hydrogen in different phases. Most of these dynamic
properties are based on empirical data and are highly
nonlinear. For example, a hydrogen energy property
table has to be used to calculate the relationships
among the internal specific energy, the pressure, the
temperature, and the density at a given state. In order
to demonstrate the complexity of the system, the
dynamics of a typical hot gas turbine, which represents
only a small portion of the whole SSME system, are
shown here. Given the upstream pressure P_u, the
upstream temperature T_u, the downstream pressure
P_d, the gas constant R, the specific heat constant C_p,
the rotational speed S, the specific heat ratio γ, the
flowrate DW and the empirically determined turbine
performance map f_H0(·) in Figure 2, the available
torque T and downstream temperature T_d can be
calculated by the following equations [3].



Figure 1. SSME Propellant Flow Schematic 
From the dynamics described above it can be seen
that there exist certain defined relationships among 
these measured variables, although these relationships 
may be complicated. Further analysis reveals that 
analytical redundancy does exist, i.e. an unknown 
variable can be estimated using other related variables. 
However, with four turbopumps and three combustors 
operating simultaneously, it is extremely difficult to 
design a Kalman Filter type estimator for any selected 
measurement without grossly simplifying the dynamics. 


SSME SENSOR GROUPS 
There are hundreds of sensors used to collect on-line
operational data. However, only 21 of them are
used for inflight control/shutdown purposes. These
sensors include: speed sensors for three of the
turbopumps; a fuel flowmeter; a pressure sensor for the
main combustion chamber (MCC); pressure and
temperature sensors for the cooling ducts; and pressure
and temperature sensors for the selected pump and
turbine inlet and discharge points. Among these
sensors only MCC pressure, high pressure fuel pump
(HPFP) inlet flow, HPFP inlet pressure and HPFP inlet
temperature are used for controlling the engine
performance. The rest of the inflight sensors are used
to monitor the operating condition and to activate the
engine shutdown when a red-line condition is
detected.

Figure 2. Hot Gas Turbine Performance Map

In order to simplify the problem, the scope of this
study is limited to sensor failure detection during
the nominal operating condition. The study can be
easily extended to detect sensor failures for abnormal
operating conditions if data for these conditions are
obtained. Also, only the single sensor failure problem
is addressed because we assume that simultaneous
sensor failures are not likely to occur and
consecutive sensor failures can be handled by cascading
single sensor failures.

From the analysis of the dynamic relationships of 
these selected measurements, an "influence sensor map" 
can be constructed. This "influence sensor map" is the 
simplified description of how a measurement can be 
directly influenced by other measurements. Again, this 
relationship may be complicated and not intuitive even 
to an expert. Among the SSME sensors, there are two 
closely related measurement clusters, one for the fuel 
system and one for the oxidizer system. Figure 3 
shows the "influence sensor map" for the fuel system. 
This cluster of measurements will be used to study the 
sensor failure detection and signal reconstruction using 
network computing. The sensors selected here are: 

P6: Main Combustion Cooling Pressure
T6: Main Combustion Cooling Temperature
Qfd1: Low Pr. Fuel Pump outlet flow, in volume
Pfd1: Low Pr. Fuel Pump outlet Pressure
Tfd1: Low Pr. Fuel Pump outlet Temperature
Pfd2: High Pr. Fuel Pump exit Pressure
Tft2d: High Pr. Fuel Turbine Downstream Pressure
Sf1: Low Pr. Fuel Turbopump Speed
Sf2: High Pr. Fuel Turbopump Speed
Pc: Main Combustion Chamber Pressure

The sensor failure detection and signal 
reconstruction problem can be restated as: 

For a given set of measurements at any time instant:

1. identify the measurement which is not consistent
with the others;

2. estimate the value for the identified failed sensor.

NEURAL NETWORK SELECTION 

The neural network structure selected for this task 
is a multilayer feedforward network with the sigmoidal 
activation function for each node (Figure 4) [5]. There 
are two networks to be trained for the two described 
functions: failure detection and lost variable estimation. 
The first network is to detect inconsistent sensor 
readings. The neural network used in this simulation
consists of 10 input nodes, 30 first hidden layer nodes, 
30 second hidden layer nodes, and 10 output nodes. 
The normalized sensor values are applied to the input 
nodes. The 10 output nodes on the final layer represent 
the confidence levels of the 10 corresponding sensor 
readings. The functional requirement of this network is 
to process a given set of normalized sensor 
measurements and generate a list of confidence 
indicators for the sensor readings. For example, for a
good set of sensor readings, the output of this neural
network is expected to have high confidence indicators
(values close to 1) for all sensors. If there is a sensor
failure, this network’s output should indicate
low confidence (a value close to 0) in the failed
sensor while indicating high confidence in the other
sensors.

Figure 3. Influence Sensor Map of SSME Fuel System

Figure 4. Feedforward Neural Network Architecture
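As a rough illustration of the shape of the detection network, the sketch below runs a 10-30-30-10 feedforward pass with sigmoidal nodes in plain Python. The weight initialization range, the bias term on each node, the random seed, and the placeholder input vector are assumptions made for this example and are not taken from the paper. The outputs of this untrained network carry no meaning until training has been carried out.

```python
import math
import random


def sigmoid(x):
    return 1.0 / (1.0 + math.exp(-x))


def init_layer(n_in, n_out, rng):
    # Each node holds n_in weights followed by one bias (assumed layout).
    return [[rng.uniform(-0.5, 0.5) for _ in range(n_in + 1)]
            for _ in range(n_out)]


def forward(layers, inputs):
    """Propagate the inputs through every layer and return the final outputs."""
    activation = list(inputs)
    for layer in layers:
        activation = [
            sigmoid(sum(w * a for w, a in zip(node[:-1], activation)) + node[-1])
            for node in layer
        ]
    return activation


rng = random.Random(0)                 # fixed seed, for repeatability only
sizes = [10, 30, 30, 10]               # input, two hidden layers, output
detector = [init_layer(n_in, n_out, rng)
            for n_in, n_out in zip(sizes, sizes[1:])]

readings = [0.5] * 10                  # placeholder normalized sensor values
confidence = forward(detector, readings)
print(len(confidence), "confidence indicators, each between 0 and 1")
```

Because every output node is sigmoidal, each confidence indicator is confined to the interval (0, 1), which is what allows values near 1 and near 0 to be read as high and low confidence.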

The second neural network recovers the
measurement lost to the failed sensor.
In this particular example, the network uses the
other nine measurements to estimate the sensor
reading identified as failed by the previous network. The
network selected here also has two hidden layers. The
network chosen for the simulation consists of 9 input
nodes, 30 first hidden layer nodes, 30 second hidden
layer nodes, and a single output node for the sensor
variable to be recovered. Usually, only the sensor
readings that are used in the control loop need to be
recovered.
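To give a sense of the size of the two networks, the following sketch counts their adjustable weights and shows how the nine remaining readings could be selected as estimator inputs. It assumes one bias per non-input node, which the paper does not state, and the helper names and the position of Pc in the reading vector are hypothetical.

```python
def count_weights(sizes):
    """Adjustable weights of a layered network, assuming one bias per node."""
    return sum((n_in + 1) * n_out for n_in, n_out in zip(sizes, sizes[1:]))


def other_readings(readings, failed_index):
    """Return the readings with the failed sensor's entry removed."""
    return readings[:failed_index] + readings[failed_index + 1:]


detector_weights = count_weights([10, 30, 30, 10])
estimator_weights = count_weights([9, 30, 30, 1])

readings = [0.1 * k for k in range(10)]   # placeholder normalized values
PC_INDEX = 9                              # assumed position of Pc in the vector
estimator_inputs = other_readings(readings, PC_INDEX)

print(detector_weights, estimator_weights, len(estimator_inputs))
```

Under the bias assumption the detection network has 1,570 weights and the estimation network 1,261, both well above the number of recorded sample sets, which is one reason training with a limited amount of data needs care.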
TRAINING THE NETWORKS
In this study, the Digital Transient Model [3] is 
used to simulate the dynamic behavior of the system. 
During start up, the main engine power level reaches 
100 percent power within 4 seconds. Since this 
transient curve covers a very wide range of operation, 
it is assumed that the information gathered during this 
time period is rich enough to train the neural networks 
for both sensor failure detection and failed sensor 
recovery. Only data from a normal operation is used 
in this study. Also, the data for the first second of 
engine operation are discarded because most of the 
measurements stay constant during that time period. 
The data samples are recorded at the rate of 50 Hz. In 
total, there are 150 sets (3 seconds) of sensor readings 
available for the neural network training. 
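The sample count follows from the sampling rate and the retained time window. The sketch below slices a 4 second record sampled at 50 Hz and scales each channel to the range 0 to 1. The min-max scaling is an assumption, since the paper only says that the sensor values are normalized, and the two-channel record is made-up placeholder data rather than DTM output.

```python
SAMPLE_RATE_HZ = 50
START_S, END_S = 1, 4   # keep samples from 1 s up to (not including) 4 s


def training_window(samples):
    """Drop the first second of a 50 Hz record and keep the next three."""
    return samples[START_S * SAMPLE_RATE_HZ:END_S * SAMPLE_RATE_HZ]


def normalize_columns(samples):
    """Min-max scale each sensor column to [0, 1] (assumed scaling method)."""
    columns = list(zip(*samples))
    lows = [min(c) for c in columns]
    spans = [(max(c) - min(c)) or 1.0 for c in columns]  # guard constant columns
    return [[(v - lo) / span for v, lo, span in zip(row, lows, spans)]
            for row in samples]


# Placeholder record: 4 s of two made-up channels.
record = [[float(k), 100.0 + 2.0 * k] for k in range(4 * SAMPLE_RATE_HZ)]
window = training_window(record)
scaled = normalize_columns(window)
print(len(window), "sets of sensor readings")
```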

A. Training for Sensor Failure Detection 

As previously described, the purpose of this network 
is to single out the sensor reading which is not 
consistent with the other measurements. For a given set 
of sensor readings we can establish a range for each 
sensor which we consider "normal". These ranges can 
usually be established by combining the experts’ 
knowledge about the process, the sensor characteristics, 
and the historical data base. Once the range of each 
measurement is selected, the goal states of the output 
nodes can easily be determined according to whether 
the measurement is within the range or not. A
back-propagation algorithm is used here to train the neural
network. The randomized iteration sequence is
described as follows:

(1) randomly select one of the 150 sets of sensor 
readings, 

(2) randomly select one of the 10 sensors to be trained, 

(3) generate a random Gaussian noise ω with zero mean
and standard deviation σ = 1.5 ε_i, where ±ε_i is the
valid range for the ith sensor reading S_i. Add the
noise ω to S_i to create a new sensor reading S_i*.
This selection of noise generates about 50%
out-of-range training samples.

(4) if S_i* is within the valid range of S_i then set the
desired output O_i of the neural network to 0.9,
otherwise set it to 0.1, also set all other desired
outputs to 0.9,

(5) adjust the weights according to the
back-propagation algorithm,

(6) repeat steps (1) to (5) until the network can reliably 
indicate the failed sensor for any given situation. 
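Steps (1) to (4) of the sequence above can be sketched as a sample generator. The sketch treats "within the valid range" as the injected noise staying inside ±ε_i around the recorded value, which is an interpretation consistent with the stated 50% split. The function names, the placeholder data, and the ε values are hypothetical. The second function checks the 50% figure analytically from the Gaussian error function.

```python
import math
import random

NOISE_SCALE = 1.5  # sigma = 1.5 * eps_i, as in step (3)


def make_detection_sample(data, eps, rng):
    """Build one (inputs, targets) pair following steps (1) to (4)."""
    readings = list(rng.choice(data))              # (1) pick a recorded set
    i = rng.randrange(len(readings))               # (2) pick a sensor
    noise = rng.gauss(0.0, NOISE_SCALE * eps[i])   # (3) zero-mean Gaussian noise
    readings[i] += noise
    targets = [0.9] * len(readings)                # (4) goal states
    if abs(noise) > eps[i]:                        # outside +/- eps_i: flag sensor i
        targets[i] = 0.1
    return readings, targets


def out_of_range_fraction(scale=NOISE_SCALE):
    """P(|noise| > eps) for Gaussian noise with sigma = scale * eps."""
    return 1.0 - math.erf(1.0 / (scale * math.sqrt(2.0)))


rng = random.Random(0)
data = [[0.5] * 10 for _ in range(150)]   # placeholder for the 150 recorded sets
eps = [0.05] * 10                         # hypothetical valid ranges
inputs, targets = make_detection_sample(data, eps, rng)
print(targets, round(out_of_range_fraction(), 3))
```

With σ = 1.5 ε_i the noise exceeds ±ε_i when its standardized value is larger than 1/1.5 in magnitude, which happens in roughly half of the draws, so the two target classes are close to balanced.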

B. Training for Failed Sensor Recovery 

When a critical sensor reading is found to be false,
it is necessary to estimate its value using other
correlated measurements. A simple approach is to have
one estimation network for each failed sensor that needs
to be recovered. For a cluster of n sensors, this
network will have n - 1 input nodes and 1 output node.
Given the normal operation data set, the training is
straightforward. The performance of the trained
network is usually excellent. The training algorithm
for the estimation of the ith sensor is:

(1) randomly select one of the 150 sets of sensor 
readings, 

(2) apply the other 9 sensor inputs to the network, 

(3) calculate the error E_i = (S_i - O_i) for the
back-propagation training,

(4) adjust the weights of the network according to the
back-propagation algorithm,

(5) repeat steps (1) to (4) until the result of the 
estimation is acceptable. 
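The loop above can be sketched with a small back-propagation routine in plain Python. This is an illustration under stated assumptions, not the authors' implementation: it uses a squared-error cost, a fixed learning rate, one bias per node, a fixed number of iterations in place of the "until acceptable" stopping rule, and synthetic placeholder data.

```python
import math
import random


def sigmoid(x):
    return 1.0 / (1.0 + math.exp(-x))


def init_net(sizes, rng):
    """One weight list per node: n_in weights followed by a bias (assumed)."""
    return [[[rng.uniform(-0.5, 0.5) for _ in range(n_in + 1)]
             for _ in range(n_out)]
            for n_in, n_out in zip(sizes, sizes[1:])]


def forward(net, inputs):
    """Return the activations of every layer, input layer first."""
    acts = [list(inputs)]
    for layer in net:
        prev = acts[-1]
        # zip stops at len(prev), so the trailing bias is added separately.
        acts.append([sigmoid(sum(w * a for w, a in zip(node, prev)) + node[-1])
                     for node in layer])
    return acts


def train_step(net, inputs, targets, rate):
    """One back-propagation update for a squared-error cost."""
    acts = forward(net, inputs)
    deltas = [(t - o) * o * (1.0 - o) for t, o in zip(targets, acts[-1])]
    for k in range(len(net) - 1, -1, -1):
        layer, prev = net[k], acts[k]
        # Deltas for the layer below, computed before these weights change.
        below = [sum(d * node[j] for d, node in zip(deltas, layer)) * a * (1.0 - a)
                 for j, a in enumerate(prev)] if k > 0 else []
        for d, node in zip(deltas, layer):
            for j, a in enumerate(prev):
                node[j] += rate * d * a
            node[-1] += rate * d
        deltas = below


def train_estimator(data, i, steps, rate, rng):
    """Steps (1) to (5): learn sensor i from the other sensors."""
    net = init_net([len(data[0]) - 1, 30, 30, 1], rng)
    for _ in range(steps):                        # (5) fixed count, an assumption
        row = rng.choice(data)                    # (1) pick a recorded set
        others = row[:i] + row[i + 1:]            # (2) the other sensor inputs
        train_step(net, others, [row[i]], rate)   # (3), (4) error and update
    return net


if __name__ == "__main__":
    rng = random.Random(0)
    # Synthetic stand-in for 150 normalized sets driven by one power level.
    data = [[0.1 + 0.8 * (n / 149.0) ** (1.0 + 0.1 * k) for k in range(10)]
            for n in range(150)]
    net = train_estimator(data, 9, 2000, 0.5, rng)
    row = data[75]
    print("estimate", forward(net, row[:9])[-1][0], "actual", row[9])
```

The update multiplies the output error (S_i - O_i) by the sigmoid derivative before propagating it backward, which is the standard form of the rule for sigmoidal nodes.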

Due to the redundancy of these selected sensors, it 
is expected that there is a certain degree of similarity in 
the estimation networks for different sensors. Thus, it 
may be much more efficient to have one estimation 
network that can estimate any selected missing variable. 
A single network to recover all variables in the SSME 
fuel system will have 10 input nodes and 10 output 
nodes. The training algorithm is more complicated and 
the performance is not as good as a single sensor 
estimator. Here, we limit our scope to the single 
estimator only. 

SIMULATION RESULTS 
As described in the previous section, the data 
generated by the Digital Transient Model (DTM) is 
used in the simulation. Initially, the data collected 
during the start up transient (i.e. 1.0 < Time < 4.0 
seconds) are used to train the neural networks. The 
first network is trained to capture the relationship 
among the measurements so that a failed sensor can be 
identified. A variable step size for the weight
adjustments is used to help fine tune the network for 
better performance. Figure 5 shows the percentage of 
error during the training of the network. The error 
percentage is calculated for every 2,500 training 
iterations. It can be seen that the network is able to 
reach more than 90% accuracy in predicting any given 
sensor failure after about one million samples. Further 
fine tuning has reduced the error to less than 5%. 
These errors occur in the neighborhood of the defined 
cutoff values of valid sensor readings. Because of the 
continuous nature of the selected network it is 
reasonable to have a gray area which indicates that the 
sensor failure is "uncertain". A second network, which
is to recover the Main Combustion Chamber Pressure
measurement, is also trained using the start up transient
data. The input to this network consists of the signals
from the other 9 sensors and its only output is the
estimate of MCC pressure. The training is
straightforward and the estimation errors are within a
few percent after several thousand iterations.
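The gray area mentioned above can be expressed as a three-way reading of each confidence output. In the sketch below the two thresholds are hypothetical values chosen only for illustration; the paper does not give numerical limits for the "uncertain" band.

```python
def classify_confidence(confidence, low=0.3, high=0.7):
    """Map a confidence output to a status; low and high are assumed limits."""
    if confidence >= high:
        return "valid"
    if confidence <= low:
        return "failed"
    return "uncertain"


# Example confidence outputs for three sensors (made-up numbers).
statuses = [classify_confidence(c) for c in (0.92, 0.55, 0.08)]
print(statuses)
```

A band of this kind keeps readings near the cutoff of the valid range from being forced into a hard pass or fail decision.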

These two networks were tested for an extended run 
of the DTM simulation. In this simulation, the 
controller starts the engine, powers the engine to 100% 
in four seconds, holds at 100% for one second, reduces 
the engine power to 65% in the next five seconds, holds 
at 65% for three more seconds, and finally gradually 
increases the power to 100%. This is to emulate the 
operation profile of the SSME during the so-called 
"Max-Q Throttle" operation. 



Case 1: HP FTP Speed Sensor Failure at T = 7.0 

Figure 6 shows the case in which the High Pressure Fuel
Turbopump speed sensor Sf2 starts failing at T = 7.0.
The failure is a soft failure, i.e. a degraded reading.
The reading drifts off the actual value at a rate of
-350 RPM per second. It can be seen that the confidence
checking network is able to detect the discrepancy 
within 0.5 seconds by indicating the confidence of that 
sensor is low (close to 0). Figure 7 shows the outputs 
of the network during the Sf2 sensor failure. It shows 
that the failed sensor can be clearly identified within a 
very short period of time after it started degrading. 
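A soft failure of this kind can be injected into a simulated reading as shown in the sketch below. It reads the stated failure rate as a linear drift that starts at the failure time, and the function name and the 35,000 RPM actual speed are placeholders rather than values from the paper.

```python
def degraded_reading(actual, t, t_fail, drift_per_s):
    """Soft failure: the reading drifts linearly off the actual value after t_fail."""
    if t < t_fail:
        return actual
    return actual + drift_per_s * (t - t_fail)


ACTUAL_RPM = 35000.0   # placeholder speed, not taken from the paper
offset_after_half_second = degraded_reading(ACTUAL_RPM, 7.5, 7.0, -350.0) - ACTUAL_RPM
print(offset_after_half_second)
```

At -350 RPM per second the drift has accumulated to -175 RPM after 0.5 seconds, so detection within that time means the discrepancy was flagged before it grew beyond 175 RPM.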

Case 2: MCC Pressure Sensor Failure at T = 8.0 

Figures 8 and 9 show the case in which the Main 
Combustion Chamber pressure sensor Pc starts failing 
at T = 8.0. The rate of failure is -300 PSI per second 
off the actual value. Figure 8 shows the outputs of the 
network which clearly indicate high confidence on all 
other sensors while singling out the Pc sensor failure. 
Figure 9 indicates that the confidence level of the 
measurement falls quickly from high (close to 1) to low 
(close to 0) when the measured value moves away from 
the real value. The on-line estimation of Pc using the 
second network is also shown in Figure 9. The 
estimated value of Pc closely follows the actual value 
and can be used for backup when the Pc sensor fails. 
This arrangement provides an uninterrupted and 
undegraded control after the sensor failure. 
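The accommodation step can be sketched as a switch between the measured and the estimated Pc. The function name and the 0.5 switching threshold are assumptions made for this illustration, and the pressure values are made-up numbers.

```python
def select_pc(measured_pc, pc_confidence, estimated_pc, threshold=0.5):
    """Return the Pc value passed to the controller and its source.

    The threshold is a hypothetical switching level, not a value from the paper.
    """
    if pc_confidence < threshold:
        return estimated_pc, "estimated"
    return measured_pc, "measured"


healthy = select_pc(3000.0, 0.93, 2995.0)   # sensor trusted
failed = select_pc(2700.0, 0.04, 2995.0)    # sensor flagged by the first network
print(healthy, failed)
```

Because the estimate is computed continuously from the other nine sensors, the switch only changes which value is used and does not wait for any additional computation.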

CONCLUSIONS 

Neural networks are proposed to detect sensor 
failures and recover the lost measurements from a 
group of redundant sensors. A two step approach is 
employed. The first network is trained to detect the 
sensor which is inconsistent with other sensor readings. 
The second network is trained to recover the sensor 
readings which are critical in operation. This approach
is especially useful when the relationship among these
sensors cannot be clearly identified or is too complex
for a Kalman Filter estimator. The network can be
trained using the experimental data for the selected
condition. The dynamic relationship among the sensors
is learned using the back-propagation algorithm.

Figure 5. Error vs. Training Iterations

Figure 6. Sf2 Sensor Degraded at T = 7.0 Seconds

Figure 7. Outputs of the Conf. Checking Net in Case 1

The proposed approach is applied to the Space
Shuttle Main Engine inflight sensor group through the
Digital Transient Model Simulation. The results clearly
show the adequacy of the approach under the tested
condition. It is conceivable that the approach can be
extended to cover other operating conditions if the
sensor data for those conditions are collected and
applied to training.

The high speed capability of neural networks makes
the proposed approach even more attractive in the
real-time control problem [6]. It was shown in this study
that the sensor measurements used for control purposes
can be easily recovered without delay. This feature is
especially useful in the design of an Intelligent
Controller where real-time diagnostics and
accommodation is one of the key issues.

Figure 8. Outputs of the Conf. Checking Net in Case 2

Figure 9. Pc Sensor Failure at T = 8.0 Seconds